    Sentence Boundary Detection for Social Media Text

    The paper presents a study on automatic sentence boundary detection in social media texts such as Facebook messages and Twitter micro-blogs (tweets). We explore the limitations of using existing rule-based sentence boundary detection systems on social media text, and as an alternative investigate applying three machine learning algorithms (Conditional Random Fields, Naïve Bayes, and Sequential Minimal Optimization) to the task. The systems were tested on three corpora annotated with sentence boundaries, one containing more formal English text, one consisting of tweets and Facebook posts in English, and one with tweets in code-mixed English-Hindi. The results show that Naïve Bayes and Sequential Minimal Optimization were clearly more successful than the other approaches.
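
To make the limitation of rule-based detection concrete, the following is a minimal sketch, not the system evaluated in the paper. It assumes a single hypothetical rule (a boundary is sentence-final punctuation followed by whitespace and a capital letter), and both example strings are invented for illustration. The rule may work on formal text but finds no boundaries in an unpunctuated, lowercase tweet-style message.

```python
import re

# Hypothetical rule (an assumption, not taken from the paper):
# a sentence ends at '.', '!' or '?' when followed by whitespace
# and then an uppercase letter.
BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def rule_based_split(text):
    """Split text into sentences using the single punctuation rule above."""
    return [s for s in BOUNDARY.split(text.strip()) if s]


# Invented examples: one formal, one in a tweet-like style.
formal = "The systems were tested on three corpora. The results are reported below."
tweet = "omg so tired lol gonna sleep now see u tmrw"

print(rule_based_split(formal))  # two sentences are recovered
print(rule_based_split(tweet))   # the whole message comes back as one unit
```

A learned classifier, such as the ones named in the abstract, would instead decide for each candidate position whether it is a boundary, using features beyond punctuation and capitalization.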